On behalf of HI-TRON SYSTEMS PTY LTD 
I would like you to accept this 
complimentary pocket guide to assist 
you in planning your data capture 
and storage. We trust that both the 
pocket guide and HI-TRON can be of 
assistance to you in the future. 


Peter McAllister 
MANAGING DIRECTOR 


Computer equipment and systems continue 
to become faster, cheaper and more 
sophisticated. One major problem 
remains: data must still be entered 
through a keyboard. With the continuous 
increase in labour costs and costs 
associated with RSI (real or imagined), 
any alternative to key entry that 
improves speed and accuracy yet reduces 
cost must be worthy of consideration. 


This handbook has been written in a 
reference format to help the end user 
and system developer understand the 
advantages and avoid the pitfalls in the 
use of both OCR and bar-code in various 
applications. 


Included are many useful appendices 
covering communications, mark reading 
and magnetic stripe reading; all areas 
where general basic information is 
lacking. 


We hope that this compact compendium 
will be kept in your top drawer or 
briefcase and used in conjunction with 
your data entry planning. Any further 
information on the subjects or equipment 
availability can be obtained by 
returning the enclosed enquiry card to 
HI-TRON. 


Ian Miller 
Marketing Manager - Scanning 


1. BAR-CODE 
1.1. Bar-codes - what are they? 


Bar-codes are a machine readable pattern of alternating 
parallel bars and spaces, representing numbers and other 
characters. They can represent a product ID number, an order 
number or any other information that must be entered into a 
computer system. The various bar-code symbologies, like 
languages, encode information differently so that a scanner 
that has been programmed to read only one particular code 
cannot read another. 


1.2. The reasons for using Bar-codes 


There are many reasons for selecting bar code scanning over 
manual methods of data entry; some of the most significant 
are:- 


To reduce data entry errors - Tests by the U.S. Department 
of Defense showed bar-coding to be the most error resistant 
method of data acquisition. They found 4 errors in 1,266,444 
lines of bar-code data - 99.9997% accuracy. 

To ensure this accuracy some bar-code symbologies are 
designed to be self-checking in the sense that each 
character is individually checked to verify that it is 
properly encoded. Others include an additional check 
character. This character is a number that is calculated 
based on the position and value of all the other characters 
in the bar-code. This is all done to eliminate scanning 
errors. 
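
As a hedged illustration of a check character calculated from the position and value of the other characters, the sketch below computes a modulo 10 check digit of the kind carried by the UPC symbol. The weighting used (odd positions count three times, even positions once) is the usual UPC-A rule and is an assumption here, since this handbook does not spell out the arithmetic; the function names are hypothetical.

```python
def upc_a_check_digit(data_digits):
    """Return the check digit for the 11 data digits of a UPC-A number.

    Assumption: standard UPC-A weighting, where digits in odd positions
    (1st, 3rd, ...) are weighted 3 and digits in even positions are weighted 1.
    """
    if len(data_digits) != 11 or not data_digits.isdigit():
        raise ValueError("expected exactly 11 digits")
    odd_sum = sum(int(d) for d in data_digits[0::2])   # positions 1, 3, 5, ...
    even_sum = sum(int(d) for d in data_digits[1::2])  # positions 2, 4, 6, ...
    total = 3 * odd_sum + even_sum
    # The check digit brings the weighted total up to a multiple of 10.
    return (10 - total % 10) % 10


def is_valid_upc_a(number):
    """Check a full 12-digit number against its final check digit."""
    return (
        len(number) == 12
        and number.isdigit()
        and upc_a_check_digit(number[:11]) == int(number[11])
    )
```

Because the weights depend on position, swapping two adjacent digits normally changes the weighted total, which is how a transposition is caught as well as a single mis-read digit.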


Furthermore, all of the bars and spaces within a bar-code 
symbol must conform to the specification in order to get a 
successful scan. Bar-code scanners do not "skip" over poorly 
printed characters. For example, if one character out of 15 
is encoded improperly, the entire symbol is rejected, not 
just the out of specification character. 


To consolidate data entry - Bar-code data acquisition reduces 
the following typical data entry sequence: 

   Hand-written Document -> Keypunch Operator -> Computer 

to 

   Bar-code -> Scanner -> Computer 

In essence, bar-code systems compete with keypunch operators. 


To provide vertical redundancy - This characteristic refers 
to the height of the bar-code. Regardless of where the 
scanner passes through the symbol, as long as it scans from 
one end to the other, the information is intact. This means 
that over 80% of a 25mm high symbol could be destroyed and 
the operator could still successfully read the data. 


Amenable to most printing techniques - Bar-codes have been 
successfully printed using every known marking technique on 
a wide range of substrates. The list includes symbols 
produced by ink jet, laser and dot matrix printers. Offset 
printers can produce symbols on substrates such as rubber, 
plastic, metal, paper and corrugated cardboard. 


Resistance to errors from printing defects - The 
self-checking features, the bar-code algorithms and vertical 
redundancy reduce the possibility of transposing characters 
or rejecting entire symbols because of printing defects. 


1.3. The uses of Bar-codes. 


Because bar-codes are machine-readable symbols they can be 
used wherever pre-printed information is to be entered 
into a computer or micro-processor based system. Some of the 
more common uses of bar-codes are:- 


Monitoring work-in-progress. 
Point-of-sale product entry. 
Assembly verification. 
Order entry. 
Controlling access to secured areas. 
Receiving and despatch. 
Library circulation. 


1.4. The bar-code structure. 


Almost all bar-codes consist of the following: 


1.4.1. Start and stop characters - These characters are at 
the beginning and the end of the bar-code symbol. They 
indicate to the scanner the direction in which the 
information is being scanned, hence allowing bi-directional 
scanning. These characters also define the type of bar-code 
being scanned and therefore allow auto-discrimination, which 
enables the more sophisticated scanners to read many 
symbologies without re-programming. 


1.4.2. Quiet zones - Immediately adjacent to the start and 
stop characters there must be an area which contains no 
markings at all. It is these quiet zones, coupled with the 
defined bars and spaces, that the scanner "recognises" as a 
legitimate bar-code. This space should be at least 10 times 
the width of the narrow bar or space. If this distance is 
too short the scanner may not recognise the code and will 
therefore not read the symbol. 


1.4.3. Interpretation line - This is the human readable 
information printed directly beneath the bars and spaces. 
It consists of the characters that are encoded in the 
bar-code symbol. In many cases they are printed in either 
OCR-A or OCR-B fonts. 


1.4.4. Bar/space patterns - These patterns represent, in a 
machine readable format, the information in the 
interpretation line. Most bar-code symbologies have simple 
structures as they consist of only wide and narrow elements 
(bars and spaces). 


The width of the narrow element is referred to as the "X" 
dimension. It is this dimension that is critical when 
selecting a printing method for bar-codes. The smaller the 
bar, the shorter the length of the symbol, the closer the 
tolerances and the more difficult it is to print. Bar-code 
specifications also define the ratio that must exist between 
wide and narrow elements, e.g. 2:1, 3:1. Exceptions to the 
simple wide and narrow patterns are structurally complex 
codes like the UPC symbol and Code 128, which have four 
different element sizes. 


1.4.5. Inter-character gap - Some bar-codes, notably Code 3 
of 9, are "discrete" in the sense that each character is 
printed independently of the other characters and separated 
by a space that is not a part of the encoded character. This 
space is called the inter-character gap. With "continuous" 
bar-codes, however, all the spaces are part of a character 
and carry necessary information, so there is no 
inter-character gap. 


1.4.6. Bearer bars - This is a box that outlines the whole 
code, including the quiet zones. The purpose of the bearer 
is to provide support for the bars when printing on some 
substrates. For example, it is recommended that bearer bars 
be utilised when printing on corrugated cardboard. 


1.5. Encoding of information in a symbol. 


Each bar-code symbology has its own unique way of encoding 
information. Many of the popular codes are structurally 
simple in that there are only two widths of bars and spaces. 
The wide elements are normally assigned the value "1" and 
the narrow elements the value "0". Below are two examples of 
how information is encoded. 


Code 3 of 9 - Each character is represented by 9 elements: 
five bars and four spaces. Three of the 9 elements are 
always wide. Here is an example of the number 9 where the 
configuration is 001100100. The characters are separated by 
an inter-character gap because Code 3 of 9 is a discrete code. 

[Figure: Code 3 of 9 Full ASCII symbol encoding "$15.00+ 10%"] 


Interleaved 2 of 5 - Each character is represented by 5 
elements: 5 bars or 5 spaces. Two of the 5 elements are 
wide. Any given bar/space pattern encodes two characters - 
one character is encoded in the bars, the other in the 
spaces. It is said that one character is "interleaved" with 
another. It is for this reason that Interleaved 2 of 5 must 
contain an even number of digits. A leading zero can be used 
where an odd number of digits must be encoded. Here is an 
example of the number 25 - the digit 2 is encoded in the 
black bars, the digit 5 in the spaces. The configuration for 
2 is 01001 and for 5 is 10100. Interleaved 2 of 5 is a 
continuous code so there is no inter-character gap. 
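
The interleaving can be sketched as follows. The patterns for 2 and 5 are the ones quoted above; the remaining entries come from the commonly published two-of-five table and are an assumption here, as are the function names. Start and stop patterns are left out.

```python
# Two-of-five patterns (1 = wide, 0 = narrow). The entries for "2" and "5"
# match the text; the others are assumed from the commonly published table.
PATTERNS = {
    "1": "10001", "2": "01001", "3": "11000", "4": "00101", "5": "10100",
    "6": "01100", "7": "00011", "8": "10010", "9": "01010", "0": "00110",
}


def interleave_pair(bar_digit, space_digit):
    """Merge two digits into 10 elements: bar, space, bar, space, ..."""
    bars = PATTERNS[bar_digit]
    spaces = PATTERNS[space_digit]
    return "".join(bar + space for bar, space in zip(bars, spaces))


def encode_interleaved_2_of_5(digits):
    """Return one 10-element pattern per digit pair (data characters only)."""
    if not digits.isdigit():
        raise ValueError("Interleaved 2 of 5 encodes numeric data only")
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of digits
    return [
        interleave_pair(digits[i], digits[i + 1])
        for i in range(0, len(digits), 2)
    ]


print(encode_interleaved_2_of_5("25"))  # ['0110010010']
```

Reading every second element of the result, starting with the first, gives back 01001 (the bars, digit 2); the elements in between give 10100 (the spaces, digit 5).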


1.6. Code density. 


Code density refers to the number of characters per inch 
(25.4 mm). There are four variables that affect code density 
- type of code, ratio of wide to narrow elements, the "X" 
dimension and the printing technique. 


Type of code - Some bar-code structures encode more 
information per inch than others: for example, Interleaved 2 
of 5 can encode more numeric information than Code 3 of 9. 


Ratio of the wide to narrow elements - Varying this ratio 
can change the code density of a given bar-code. For 
example, Code 3 of 9 with an "X" dimension of 10 mils 
(0.25mm) will allow 6.28 characters per inch with a 3:1 
ratio and 7.39 characters per inch with a 2.2:1 ratio. 


"X" dimension - Everything else remaining equal, changing 
the "X" dimension will change the code density. For example, 
Code 3 of 9 at a 3:1 ratio will give 8.37 characters per inch 
with an "X" dimension of 7.5 mils (0.188mm) and 1.57 
characters per inch at a 40 mil (1mm) "X" dimension. Some 
focus only on this aspect and classify code density as 
follows: 


                  "X" Dimension 

High density      10 mils (0.25mm) or less 
Medium density    10 mils to 20 mils (0.5mm) 
Low density       20 mils or greater 


Code density will also be constrained by the type of 
printing technique being used to produce the symbol. 
Printing techniques vary the size of the "X" dimension, the 
wide to narrow ratio and the type of code that can be 
produced. 
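
The Code 3 of 9 figures quoted in this section can be approximated from the structure of the code: each character has 6 narrow and 3 wide elements and is followed by an inter-character gap. The sketch below assumes the gap is one "X" wide unless stated otherwise; both that assumption and the function names are illustrative. It lands within about 1% of the handbook's figures rather than exactly on them, which suggests the published values rest on slightly different allowances.

```python
def code39_chars_per_inch(x_mils, ratio, gap_mils=None):
    """Approximate Code 3 of 9 density in characters per inch.

    x_mils   - width of the narrow element ("X" dimension) in mils
    ratio    - wide to narrow ratio, e.g. 3.0 or 2.2
    gap_mils - inter-character gap; assumed equal to X when not given
    """
    if gap_mils is None:
        gap_mils = x_mils
    # 6 narrow elements + 3 wide elements + the inter-character gap.
    character_width = x_mils * (6 + 3 * ratio) + gap_mils
    return 1000.0 / character_width  # 1000 mils to the inch


def density_class(x_mils):
    """Classify by "X" dimension using the table in this section."""
    if x_mils <= 10:
        return "high"
    if x_mils < 20:
        return "medium"
    return "low"


for x, ratio in [(10, 3.0), (10, 2.2), (7.5, 3.0), (40, 3.0)]:
    cpi = code39_chars_per_inch(x, ratio)
    print(f"X={x} mils, {ratio}:1 -> {cpi:.2f} cpi ({density_class(x)} density)")
```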


1.7. The popular Bar-codes. 


The following is a list of the more popular codes, along 
with a brief description of their characteristics. 


1.7.1. Code 3 of 9 - This is a structurally simple, 
discrete, variable length code. Variable length refers to 
the fact that the scanners do not have to be programmed to 
read a specific number of characters. 43 characters can be 
encoded with the standard version (0-9, A-Z, - . $ / + % and 
space). The asterisk (*) is reserved for the start/stop 
characters. The entire 128 character ASCII set can be 
encoded in an expanded version that uses two characters to 
represent some of the special ASCII characters. For example, 
the ASCII ESC is encoded using %A. 


With an "X" dimension of 7.5 mils (0.188mm), a 2.2:1 wide to 
narrow ratio and an inter-character gap of 7.5 mils, Code 3 
of 9 can have a density of 9.85 characters per inch. 


1.7.2. Interleaved 2 of 5 - This code has been adopted as 
the Shipping Container Symbol (SCS) for use by the warehouse 
and distribution industry. It is a structurally simple, 
continuous, fixed length code. In this case, fixed length 
refers to the fact that scanners must normally be set to 
read a specific number of digits. If this were not done, it 
would be possible for the scanner to read only a portion of 
the symbol. Adding a check digit could also reduce this 
possibility. Interleaved 2 of 5 can only encode numeric 
information. 


With an "X" dimension of 7.5 mils and a 2.2:1 wide to narrow 
ratio, Interleaved 2 of 5 can encode 17.8 characters per 
inch. 


1.7.3. Codabar - This code is the symbology adopted by the 
pathology industry. It is a structurally complex, discrete, 
variable length bar-code in which each character is composed 
of 7 elements, with 2 or 3 of these elements being wide. 
Codabar can encode 20 characters (0-9 - $ : / . + A-D). A-D 
are used only as start/stop characters but can be used to 
provide information. 


With an "X" dimension of 6.5 mils (0.17mm) Codabar can 
encode 10 characters per inch. 


1.7.4. Code 128 - This code is relatively new, being 
introduced in 1981. It is structurally complex, continuous, 
variable length and requires 1 check character. There are 4 
element sizes and the entire 128 character ASCII set can be 
encoded. 


With an "X" dimension of 10 mils, Code 128 can encode up to 
18.2 numeric characters per inch and 9.1 alpha-numeric 
characters per inch. 


1.7.5. Code 93 - This code is relatively new, being 
introduced in 1982. It is structurally complex, continuous, 
of variable length and requires 2 check characters. There 
are four different element sizes. Forty eight characters can 
be encoded with the standard version (0-9 A-Z - . $ / + % 
and a set of special symbols). A special symbol is reserved 
for the start/stop character. The entire 128 character ASCII 
set can be encoded using the expanded version, similar to 
Code 3 of 9. 


With an "X" dimension of 8 mils (0.20mm), Code 93 can encode 
13.9 characters per inch. 


1.7.6. UPC (EAN - APN) - This code has been adopted by 
the grocery industry for scanning items at the check-out 
counter. There are a number of versions used world wide and, 
due to the constraints of format, it would be the most 
complex symbol in common use. 


The first self service stores in the USA were the Piggly 
Wiggly chain in 1916. However, it wasn't until 1974 that the 
first scanner using the new UPC code was installed. UPC is a 
structurally complex, continuous, fixed length code which 
also incorporates a check character. The same applies to EAN 
and APN, with further information to be found in Appendix A. 


1.8. How a bar-code scanner works. 


Scanners contain both a light source and a light detector. 
The scanner focuses a spot of light on the symbol which is 
either reflected by the spaces or absorbed by the bars. The 
light detector responds to the reflectivity of the bars and 
spaces and through hardware and software logic determines if 
the pattern represents a valid bar-code. If it does, the 
symbol is decoded. 


Scanners can be programmed to read one specific bar-code 
symbology or a number of them. Changing from one code to 
another can be done automatically by the scanner using 
auto-discrimination, by the user scanning a menu that 
programmes the scanner to read a particular bar-code or by 
setting switches in the scanner. 


Carbon black bars on a white background provide the best 
contrast for the scanner to differentiate between the bars 
and spaces. It is not possible to determine visually if the 
proper contrast exists. For example, due to the nature of 
the technology in some scanners, red bars on a white 
background do not provide any contrast at all for the 
scanner. Red is "seen" by the scanner as white. 


Visual selection cannot be used for selecting substrates 
either. Black bars on kraft board provide sufficient 
contrast for scanning and yet, visually, kraft could be 
rejected as being too dark. Printing bar-code on surfaces 
with high reflectivity, such as aluminium, presents another 
problem. The aluminium appears as black to the scanner, so 
in this case the white spaces are printed to provide 
contrast. 


Industry and government specifications have established 
standards for the reflectivity of the substrate and the 
bars. In addition a measurement called the Print Contrast 
Signal (PCS) is also specified (see 2.10). 
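
As a rough sketch of how such a contrast figure is obtained, the commonly used definition of PCS is the difference between the reflectance of the spaces and that of the bars, divided by the reflectance of the spaces. That definition is an assumption here, since the handbook does not give the formula at this point, and the reflectance values in the example are invented for illustration. Both reflectances must be measured at the wavelength the scanner uses, which is why a visual check is unreliable.

```python
def print_contrast_signal(space_reflectance, bar_reflectance):
    """Return PCS = (R_space - R_bar) / R_space.

    Assumption: the commonly used definition of Print Contrast Signal.
    Reflectances are fractions between 0 and 1, measured at the
    scanner's own wavelength.
    """
    if not 0 < space_reflectance <= 1:
        raise ValueError("space reflectance must be in (0, 1]")
    if not 0 <= bar_reflectance <= 1:
        raise ValueError("bar reflectance must be in [0, 1]")
    return (space_reflectance - bar_reflectance) / space_reflectance


# Hypothetical readings under a red light source:
black_on_white = print_contrast_signal(0.80, 0.10)  # carbon black bars
red_on_white = print_contrast_signal(0.80, 0.78)    # red bars look "white"
print(f"black bars: {black_on_white:.3f}, red bars: {red_on_white:.3f}")
```

With these invented readings the red bars give a PCS close to zero, matching the remark above that red is "seen" by such a scanner as white.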


1.9. The type of scanner available. 


A useful classification of scanners is by the light source 
used. Following is a list of the more popular light sources 
used in commercially available equipment. 


Incandescent light source - This is one of the earliest 
light sources. The scanners, using an incandescent lamp, can 
be contact wands or non-contact reading devices. Some can 
scan bar-codes of various colours, including red, but use 
more power than the LED light source and require regular 
replacement of a relatively expensive filament lamp. This 
type of wand tends to have a greater depth of field than the 
LED type. 


LED light source - This is the light source used by bar-code 
wands. The wands can be connected to portable or fixed 
terminals. The distance from the light detector to the 
bar-code is very important as this distance, the focal 
length, is fixed by the design of the wand and requires that 
the tip of the wand contacts the bar-code. Any increase or 
decrease in this distance, sometimes caused by a thick 
lamination over the bar-code, can result in scanning 
difficulty. 


Wands are designed with different tip openings, or apertures, 
that must match the "X" dimension of the bar-code. Wands with 
small diameter apertures, called high resolution wands, are 
designed to read high density bar-codes - those with an "X" 
dimension of 10 mils or less. High resolution wands are more 
likely to be affected by small specks and voids. 


Wands with the larger diameter aperture, called low 
resolution wands, read bar-codes with a large "X" dimension - 
10 mils and greater. They are not as susceptible to scanning 
problems from ink specks and voids, but have difficulty 
reading high density bar-codes. 


LED Light sources also vary by the kind of light that they 
emit. The two most common are the infra-red LED (930 nm) and 
the visible red LED (633 or 700 nm). 


The infra-red LED light source consumes less power than the 
visible red light source. The infra-red wands require that 
the bar-codes be printed with high carbon content inks. 
Scanning problems will be encountered when trying to scan 
bar-codes produced by some thermal printers and by dye inks 
as there is not enough contrast for the scanner to read the 
symbol. However, the infra-red wands are less susceptible to 
ambient light interference than are visible red wands. 


The visible red LED light sources, though they consume more 
power, are ideal for scanning bar-codes produced by thermal 
printers, dot matrix printers and most other printing 
techniques. 


Helium Neon laser light source - These scanners are 
non-contact reading devices that can be connected to 
portable or fixed terminals. 


The HeNe laser scanners are characterised by a large depth 
of field (from 50mm to over 1000mm in some equipment), the 
ability to read bar-codes printed with dye inks, and scan 
rates of up to 1000 scans per second. This last 
characteristic allows the laser scanner to read bar-codes as 
they are moving by on a high speed conveyor or to find the 
"sweet spot" in marginally printed bar-codes. 


HeNe laser scanners are available in a number of forms. They 
are: 


Fixed mounting - non moving beam. 
Hand held - non moving beam. 
Fixed mounting - beam oscillates in one plane. 
Hand held - beam oscillates in one plane. 
Flat top bench mount - beam oscillates in two planes. 


The fixed mount units are usually used on a conveyor, the 
hand held units are used in stock control or retail whereas 
the bench top unit is mostly used in supermarkets. 


Laser diode scanners - These are similar to the HeNe units 
but are cheaper, more robust and take less power to run. 
They are mainly hand held units and operate with a 
wavelength around 780 nm. 


Slot scanners - These units are fixed mount units that scan 
when a document or card is passed through a slot. They are 
essentially a contact reader and are used mainly for reading 
security or identification cards. 


1.10. How to verify that bar-codes are scannable. 


It is impossible from visual inspection to ascertain if a 
bar-code is scannable. There are two methods being used to 
check bar-codes. The first is to purchase a commercially 
available scanner and simply scan the symbol. This provides 
a relatively inexpensive way to verify the bar-code in 
question. However, as the scanner is of a fixed density it 
may not match the density of every bar-code being produced. 
Another problem with the scanner approach is that if a 
particular symbol will not scan the reason cannot be 
defined. No feedback is provided to indicate if the PCS is 
out of spec., a character is encoded incorrectly or the 
symbol is out of tolerance. The second method employs a 
verifier/analyser that actually measures the bars, the 
spaces, the PCS and tells you whether or not they meet the 
specifications. 


The verifier makes no assumptions about the type of scanner 
being used - if the bar-code is printed within 
specifications, the scanner must be able to read it. It is 
recommended that a verifier/analyser be used when it is 
required to supply bar-codes to one of the regulated 
industries, such as those using the APN code. 


1.11. Printing bar-codes. 


Bar-codes have been printed by practically every known 
printing technique and on a wide variety of substrates. They 
can be printed directly on a product or applied using labels. 


Printing processes vary in their ability to print a 
consistent "X" dimension. Below are listed some of the 
printing techniques used for printing bar-codes together 
with the corresponding range of minimum "X" dimensions. The 
range depends on quality control and the equipment 
manufacturer. 


Printing techniques - Off-site      Minimum "X" 

Flexo on labels                       7.5 -  10.0 mils 
Flexo on corrugated                  20.0 -  40.0 mils 
Offset on labels                      7.5 -  10.0 mils 
Letterpress on corrugated            20.0 -  40.0 mils 
Ion deposition                        7.5 -  10.0 mils 
Photocomposition                      7.5 -  10.0 mils 

Label Printers - Inhouse            Minimum "X" 

Character impact                      7.5 -   9.0 mils 
Dot matrix                           13.0 -  19.0 mils 
Electrostatic                        10.0 -  72.0 mils 
Laser                                 7.5 -  10.0 mils 
Thermal                              12.0 -  14.0 mils 
Ink jet                              10.0 -  12.0 mils 
Hot stamp                             7.5 -  10.0 mils 
Offset                                8.0 -  12.0 mils 

Direct Printers                     Minimum "X" 

YAG laser                             1.0 -   7.5 mils 
CO2 laser                             8.0 -  10.0 mils 
Offset                               10.0 -  15.0 mils 
Ink jet - large character           100.0 - 125.0 mils 
Ink jet - small character            10.0 -  12.0 mils 
Flexo                                30.0 -  40.0 mils 


The chosen printing technique will be a function of many 
variables such as total cost, method of application, rate of 
application, flexibility of the system, code density, label 
size, label volume and regulations. 


2. OPTICAL CHARACTER READERS 
2.1. OCR - What is it? 


Next to keypunching, Optical Character Reading is the oldest, 
most mature data entry technique in existence. Long before 
the first key-to-disk system or CRT was used, Optical 
Character Readers were entering data in commercial and 
government EDP installations. 


Over the years a number of fonts (type styles) have been 
developed by various manufacturers or regulatory agencies 
and OCR manufacturers have designed their equipment to read 
fonts that were not initially designed for scanning. In 
recent years a great deal of effort has been put into 
software that will recognise alpha-numeric hand print and 
highly degraded machine printing. This is generally only 
available on the higher cost batch readers as the simple OCR 
wand does not have the processing power to perform the 
recognition function. 


The popularity of OCR is on the increase with the advent of 
fast microprocessors providing the vehicle for vastly 
improved recognition techniques. This can be shown in OCR 
wands now reading print that, 10 years ago, large batch 
readers would have rejected and in batch readers increasing 
both effective read rates and accuracy. 


2.2. Reasons for using OCR. 


There are a number of reasons for choosing OCR scanning over 
other methods of data entry. Some of the more significant 
are: 


To reduce data entry errors - Tests on good quality print 
have shown that OCR can achieve less than 0.01% character 
rejection rate with a throughput of up to 200 characters per 
second (wand reader) and up to 2000 single line documents 
per minute (document reader). 


To consolidate data entry - The ideal application for OCR is 
that of re-entering documents produced on some form of 
character printer, such as a line or laser printer or credit 
card voucher imprinter. Where there is a requirement for 
variable information to be added to the document, this is 
done in pre-defined hand print or mark areas which are also 
read by the scanner. 


To handle peak loads - Many turnaround applications are run 
on a regular cycle. This produces peaks and troughs in the 
data entry work load with the associated labour problems. 
The speed and ease of use of OCR allows the application to 
be run with a minimum of specialised staff, thereby reducing 
the peak costs. 


Can be used with many printing techniques - OCR documents 
have been produced on a wide range of devices including dot 
matrix, laser, ink jet and high speed line printers. 
Printing can also be produced on small portable data capture 
units and by price marking guns. 


Scanning corrections - Both hand held and batch scanners have 
a width of scan that is at least 4 times the height of the 
characters and have sophisticated techniques for 
"de-skewing" a line of print. With modern recognition 
methods even characters that have missing sections can be 
recognised, but this does not absolve the user from ensuring 
that the print quality is maintained. Wherever possible, 
check digits should be used on the end of pre-printed fields. 


2.3. The uses of OCR 


There are many applications where the various types of OCR 
can be used. Some of the more common are in the following 
areas: 


Turnaround billing documents (remittances) 
Banking documents and cheques 
Word processing 
Order entry using hand print 
Meter reading 
Travel tickets 
Identification and security 
Document tracking 
Payroll using hand print 
Freight tracking 
Local Government and Utility rate notices 


2.4. Fonts used in OCR applications. 


2.4.1. OCR A - This is actually a group of three fonts 
having similar shapes but different sizes and proportions. 
Developed by the American National Standards Institute, OCR 
A Size I is the most widely used OCR font. This font in its 
alpha-numeric format can be read more accurately than any 
other. This is due to its highly stylized character shapes, 
which maximise the difference between characters. Size I 
has a pitch of about 10 characters per inch and is typically 
used on typewriters, computer printers and price marking 
guns. Sizes III and IV are rarely seen. 


[Figure: sample character sets for OCR-A Full Alpha and OCR-B Limited (ECMA-11)] 


2.4.2. OCR B - Originating in Europe, this font is closer in 
appearance to conventional type faces than OCR A. Exponents 
of the OCR B font are concerned about readability by people, 
even though it is more costly to provide the recognition 
logic to handle it. This font is also called ISO B and 
occasionally ECMA-11. When reading numeric only fields there 
is little to choose between OCR A and OCR B but great care 
must be taken when reading OCR B alpha-numerics; note below 
the similarities between O and 0, S and 5, Z and 2, B and 8 
etc. 


[Figure: OCR B sample characters] 


2.4.3. Farrington 7B - This is the font used to imprint the 
account number on bank, oil company and other credit 
vouchers. It is a numeric font with two unambiguous alpha 
characters. 


01234567891+$.-/ 


2.4.4. IBM 407-1 - This is the "standard" type face on the 
IBM 1403 printer. The first IBM OCR machine (the 1418, 
announced in 1960) read the numerics of this font. It is no 
longer in common use for scanning. 


1234567890 


2.4.5. E-13B - This was not originally an OCR font. It was 
developed by and for banks prior to the development of OCR. 
It is a highly stylized numeric font intended for printing 
in magnetic ink to facilitate the sorting and processing of 
bank cheques. Some readers (including wands) read this font 
optically because of the convenience of scanning and then 
printing the amount with a MICR encoder. Occasionally, 
alphabetic characters which resemble the E-13B numerics may 
be seen in a logo or science fiction title. This is pure 
art work which cannot be read by machine. 


ABCDEFGHIJKLMNOPQRSTUVWXYZ 
0123456789 


2.4.6. Hand Print - This is not so much a font as a set of 
rules for forming characters. There is an ANSI standard for 
the shape of the numerals but this only applies in the USA, 
with Europe and Japan having different characteristics. The 
reading of alphabetic characters requires a much tighter 
definition of the character shapes and this tends to be 
defined by the manufacturer. 

Note the tags and crosses on the alpha characters to help 
avoid recognition problems. 


Many other fonts can be read by the various scanners; these 
include: 


Farrington 12L/12F, NOF, 3/16-inch Gothic, Courier, Pica, 
Elite, etc. 


2.5. Recognising Characters 


Of the two ways to recognise characters, matrix matching is 
the simpler and more common. The logic sees each character 
as a two dimensional array of binary 1s and 0s. This array 
is then compared mathematically to sample characters, or 
templates, that are similarly encoded as arrays in memory. 
Each type-face contains between 40 and 80 such templates: 
one for each letter of the alphabet (upper and lower case), 
plus numerals and punctuation. In old OCR scanners the 
templates were permanently wired into the hardware; modern 
scanners store them in programmable circuits and firmware. 


To speed the recognition in matrix matching, as each 
character is scanned the digitised image is simultaneously 
compared with every template in the type face. Each 
comparison yields a numerical value for the "distance" 
separating the template from the sampled shape. A separate 
processor searches for the shortest distance, and designates 
the corresponding character as the best match. As all 
templates will show some measure of distance, the amount by 
which the second closest match must be above the selected 
character will set the point at which a character will be 
rejected rather than mis-read. 
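
A minimal sketch of matrix matching with a rejection margin follows. It assumes the "distance" is simply the number of differing cells between two binary arrays; the handbook does not say which measure real readers use. The 3 x 3 templates, the names and the margin of 2 are invented for illustration, and the comparisons run one after another here rather than simultaneously.

```python
# Hypothetical 3 x 3 templates; real templates are far larger.
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def distance(image, template):
    """Count the cells in which the scanned image and the template differ."""
    return sum(
        pixel != expected
        for image_row, template_row in zip(image, template)
        for pixel, expected in zip(image_row, template_row)
    )


def recognise(image, templates=TEMPLATES, min_margin=2):
    """Return the best matching character, or None to reject the character.

    The character is rejected when the runner-up is not at least
    min_margin further away than the best match.
    """
    ranked = sorted((distance(image, t), ch) for ch, t in templates.items())
    (best, character), (runner_up, _) = ranked[0], ranked[1]
    return character if runner_up - best >= min_margin else None


print(recognise(("111", "010", "011")))  # a "T" with one stray cell -> T
print(recognise(("110", "010", "010")))  # equally close to I and T -> None
```

Raising the margin trades mis-reads for rejections: more doubtful characters are refused rather than guessed.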


Matrix matching works best when the OCR encounters a limited 
repertoire of type styles, with little variation within each 
style. Where the characters are less predictable, feature, 
or topographical, analysis is superior. In this technique, 
the OCR assembles a catalogue of details for each character 
it scans: loops, vertical and horizontal lines, line 
crossings, line endings, bays, lagoons, and so on. Stored in 
the logic is a list of the features of each generic 
character. The "T", for example, might be summarised as "one 
vertical line, one horizontal line, three line ends, no 
curves". As each character is read and digitised, its 
features are compared with those of each generic character. 
It's a brute force method, requiring voluminous program 
code. 
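
The comparison step of feature analysis can be sketched as a lookup against stored feature lists. Only the entry for "T" follows the summary given above; the entries for "L" and "O", the feature names and the function name are invented for illustration, and the much harder job of extracting the features from the digitised image is assumed to have been done already.

```python
# Stored feature lists for each generic character. The "T" entry follows
# the text; "L" and "O" are hypothetical entries added for contrast.
GENERIC_FEATURES = {
    "T": {"vertical_lines": 1, "horizontal_lines": 1, "line_ends": 3, "curves": 0},
    "L": {"vertical_lines": 1, "horizontal_lines": 1, "line_ends": 2, "curves": 0},
    "O": {"vertical_lines": 0, "horizontal_lines": 0, "line_ends": 0, "curves": 1},
}


def classify_by_features(observed):
    """Return the single character whose features match, otherwise None."""
    matches = [
        character
        for character, features in GENERIC_FEATURES.items()
        if features == observed
    ]
    return matches[0] if len(matches) == 1 else None


scanned = {"vertical_lines": 1, "horizontal_lines": 1, "line_ends": 3, "curves": 0}
print(classify_by_features(scanned))  # T

# A "T" whose stem is broken gains an extra line end and no longer matches.
fractured = dict(scanned, line_ends=4)
print(classify_by_features(fractured))  # None
```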


Matrix matching, since it looks at the shape as an entity, 
is better suited for reading degraded print. Topographical 
analysis is more sensitive to flaws in the characters - a 
fractured character, for example, might be interpreted as 
two characters. However, topographical analysis is possibly 
the only hope for recognising the widely varying characters 
formed by the human hand. A few OCR machines presently read 
hand printed numerals but the ability to read hand lettering 
has been slower in coming. A system able to recognise hand 
printed alphanumerics with high reliability could, for 
example, automate the handling of consumer mail orders. This 
technology is now being incorporated into systems which are 
fast becoming cost effective, not only in providing part of 
the data entry solution but by providing integrated system 
solutions that can potentially solve a company's entire data 
preparation problem. 


2.6. Types of readers 


Optical Character Readers must be classified by type of 
machine. Each type does a particular job most efficiently as 
no general purpose OCR unit can do all types of work at 
maximum efficiency. The major types of OCR machines are as 
follows: 


2.6.1. OCR Wands These inexpensive and simple to use units
are designed for attachment to a wide variety of retail and
EDP terminals. Interfaces are available for POS terminals
supplied by NCR, DTS, Kingtron, Sharp, etc. and VDUs from
IBM, NCR, Unisys, Memorex, Telex and DEC. Self contained
boards can also be fitted to IBM PCs (and compatibles) and
operate as if the data were entered from the keyboard.


The hand held part of the scanner contains only the light
source and the scanning device. The recognition and
interface logic are contained in a separate box to which the
wand is attached by means of a slender cable. This
configuration is necessary in order to keep the hand held
portion as light as possible.


These units only read OCR A, OCR B or E-13B and are
available in a slot format which allows a document, such as
a cheque, to be slipped past a fixed head and scanned. The
obvious requirement with this is that all data must be in a
fixed position in relation to the bottom of the document.


2.6.2. Remittance Processing Systems This type of system
employs a low speed, manually fed OCR to read the data on a
voucher or bill stub. It has further capabilities in that it
permits the operator to encode the amount on the customer's
cheque with MICR characters, prints a sequence number on the
document for auditing purposes and accumulates and prints
control totals. Some systems also incorporate a microfilm
camera in the transport mechanism so that the voucher is
scanned, the cheque encoded and both documents microfilmed
in the one pass.


2.6.3. Document Readers This unit is similar to a card
reader in that one document is equal to one input record.
The most common type of document read by this type of
machine is the turnaround bill used by public utilities and
in the insurance, retail and publishing industries where the
customer base is the general public. The computer generated
information is generally small enough to be contained on one
line, therefore most document readers only read that one
line and have another method, such as hand print or marks, to
enter any variable information. Other characteristics of
document readers are a high document throughput rate and the
ability to handle several document sizes, but on a
non-intermixed basis. Documents normally have a larger
horizontal than vertical dimension and the data is printed
(and scanned) parallel to the long dimension.


2.6.4. Page readers Page readers are frequently used to 
enter new master file records, or changes to these records, 
into the computer system. 



The data on a page-type form usually runs parallel to the
short dimension and the reader scans across the form. Thus
the transport mechanism, scanner orientation and direction
of scanning are quite different from those of a document
reader. Some readers only scan one line at a time and then
advance the paper to scan the next line, whereas the faster
readers will scan the whole document whilst it is stationary,
although it is still read line by line. Rejected characters
can either cause the machine to stop, display an image of
the character for operator entry and then continue scanning,
reject the document for later re-entry or continue scanning
and capture the rejected image for later correction.


2.6.5. Combination Document/Page Readers This type of
machine is a compromise that allows both documents and pages
to be fed through the same transport mechanism. These
machines are usually quite expensive and document throughput
is usually significantly slower than that achievable on a
dedicated document reader.


2.6.6. Multi-Media Data Entry This type of system combines 
OCR with more traditional key-to-disk capabilities. A single
multi-media system is less expensive than separate OCR and
key-to-disk systems because it shares the cost of common
elements like the processor, memory and disk. On a typical
remittance stub, for example, OCR is the ideal data-entry 
technique for reading most of the data - account number, 
balance due and minimum payment. However, key-to-disk is the 
better technique for entering the payment amount or address 
changes that may appear on the document in a form that is 
unreadable by OCR. 


2.6.7. Page Readers for Word Processing Input A major
difference in scanning for word processing compared with
scanning for EDP is that in the WP environment, the scanned
information will be subjected to subsequent proofing or
editing operations, during which any rejects or errors can be
manually corrected. This is why the word processing OCR can
scan alpha-numeric upper and lower case plus punctuation
without fear, whereas, when scanning for EDP, the data must
be correct the first time.


2.7. Methods of scanning 


2.7.1. Wands The scanner unit contains two incandescent
lamps that illuminate the media. The image is focused by a
lens system on to a self scanning diode array. This diode
array is scanned from top to bottom and a signal,
corresponding to the relative value of the reflected light,
is output to the video processor. The diode array is also
able to indicate the direction in which the data is being
scanned, so permitting bi-directional scanning. The data then
goes through the process of Character Isolation, Character
Recognition, Data Validation and Data Transfer.


2.7.2. Document/Page Readers There are many methods
whereby the character image can be lifted from the document
as this will depend on whether the scanner reads lines or
pages. Some of the common methods are:

By illuminating the line with a very bright and even light
and "reading" the data with arrays of light sensitive
diodes. The lens required for this has to be made to very
tight tolerances.


By passing an oscillating laser beam over the line of data
and then reading the reflected dot through a photo-multiplier
tube which will amplify the received signal and pass it on
to logic that will store the scan for decoding.


By brightly illuminating the page, lifting the whole image
through an image dissector (similar to devices used in a
video camera), digitising the picture, storing that
information and then scanning the page electronically.


2.7.3. Remittance Processing Systems The MICR on cheques
was originally designed to be read magnetically and is
really quite a clever six channel bar-code symbology. The
six channels are spread vertically across the character.
The width and spacing of the black segments are read
through a magnetic head and the characters are de-coded.


The reason for using magnetic ink is that signatures and
other writing could pass through the characters and hence
make them un-readable optically. However, with the advances
in optical reading the extraneous marks can be eliminated
and low cost slot scanners are now taking the place of
relatively expensive magnetic readers.
2.8. OCR Reading Accuracy


OCR scanners are very accurate - much more accurate than
human keypunch operators. Keying is a monotonous task; the
operator's mind may be on his/her boyfriend/girlfriend and
therefore mistakes are made, not in recognising individual
characters but in keying or transposing what was seen on the
source document. OCR machines, however, are less discerning
when it comes to identifying a distorted or incomplete
character, hence the need to keep the quality of the data
presented for scanning at the highest possible level.


2.8.1. Reading Repertoire All other conditions being
equal, the larger the reading repertoire of the scanner, the
more is paid for reading accuracy. Numeric-only data can be
read accurately for less money than alpha-numerics. Adding
lower-case, punctuation and special symbols either adds
further to the cost or reduces the accuracy.


2.8.2. Printing Quality The quality of OCR printing from a
high speed line printer is typically far lower than that
produced by a daisy wheel typewriter. The quality of print
from dot-matrix printers tends to lie between these, that is
if the OCR font is available as an option at all. Laser
printers tend to approach the quality of a daisy wheel and,
of course, have a great speed advantage. They also have the
ability to produce the total form (logo and all) and can
even print the boxes used for hand print scanning. Whatever
the application, the OCR scanner will only be as good as the
print being presented to it.


2.8.3. Cause and Prevention of Rejects Where the
recognition logic does not "see" an adequate "distance"
between two possibilities for a recognition match, the
character is rejected. The causes of this rejection can
usually be traced to the printer and are as follows:


Worn or poor quality ribbon.


Wavy print lines (drum printer) or merged characters (chain
or band printer).


Dirt on the drum or chain.


Print hammer timing out of adjustment.


Particular characters being worn: most usually the zero.


Faulty pin on a dot-matrix head.


The design of the font produced by the printer is not to
specification.


Other causes of rejects are: 


Poor quality paper, where the ash content is too high (dark
spots in the paper) 


Printing on the back of the form is “seen” through the paper 
by the scanner. 


There is writing or the image of a stamp over the OCR area 


The format of the data does not match that programmed in the 
OCR 


2.8.4. Cause and Prevention of Mis-reads A mis-read is
more serious than a reject because the scanner "believes"
that it has recognised the character correctly. The causes
of mis-reads are basically the same as for rejects but,
apart from improving the print quality, there is only one
method of protection against mis-reads. A check-digit must
be added to any key field where an error cannot be
tolerated. These check-digits are computed arithmetically at
the time the document is being printed and are usually
appended to the end of the data line to be checked. The
scanner will carry out the same calculation and based on the
result will either accept or reject the document.
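
The check-digit calculation can be sketched as below. The text does not say which formula is used, so this example assumes a modulus 10 scheme with alternating weights of 3 and 1; the function names and the sample account number are hypothetical.

```python
def check_digit(data: str) -> int:
    """Modulus 10 check digit, weights 3,1,3,1... from the rightmost digit.
    The weighting scheme is an assumption made for this example."""
    total = sum(int(d) * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(data)))
    return (10 - total % 10) % 10


def accept(field: str) -> bool:
    """Repeat the printer's calculation, as the scanner would, and
    compare the result with the last digit of the scanned field."""
    if not field.isdigit() or len(field) < 2:
        return False
    return check_digit(field[:-1]) == int(field[-1])


# Printing side: append the check digit to a made-up account number.
printed = "12345" + str(check_digit("12345"))
```

With this scheme a single mis-read digit, or the swapping of two neighbouring digits in the sample number, changes the calculated check digit and the document is rejected rather than accepted with wrong data.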


2.9. Media Requirements


Media design is a critical factor in the implementation of a 
successful OCR application. The major areas where care must 
be taken are as follows:


2.9.1. Ink Selection Ink that must be seen and read by the
OCR is called "read" ink. Read ink must be sufficiently
non-reflective in the light spectrum of interest to provide
the necessary contrast for reading. The image detectors used
in most scanners have their highest sensitivity near the
infra-red spectrum where eye sensitivity is poor. This makes
it difficult for the human eye to evaluate the adequacy of
an ink for reading.


Ink is available that is visible to the human eye but is 
invisible to the reader. This “non-read" or “blind” ink can 
be used for marking positions and descriptive materials and 
due to the infra-red sensitivity of the reader, can be made 
in a wide range of colours. 


Typical reflectance curves for various inks and paper are 
shown in the chart on the next page. The spectral region of 
interest for the reader is centred in the infra-red area. 
Note that the blue and red inks have a high reflectance in
the region of interest, which is close to the reflectance of
white paper. Most black inks have a high carbon content
which causes high absorption (low reflectance) in the
visible and infra-red region. Most inks and papers suitable
for OCR applications have been rated by the paper 
manufacturers and technical data is generally available upon 
request. 


[Chart: typical reflectance curves for non-read ink, marginal
non-read ink, marginal read ink and read ink. Horizontal
axis: wavelength in nanometers, 200 to 1100, spanning the
ultra violet, visible and infra-red bands. The region of
interest for most OCR is marked in the infra-red.]


2.9.2. Paper Selection The use of a standard OCR grade
white paper is recommended. Papers are rated in terms of
reflectance, opacity, consistency, ash content, smoothness
and gloss. Measurement of the technical parameters of paper
is complex, requiring extensive instrumentation. A few
general guidelines are given in the following:


Finish - Paper stock should have a dull or flat finish.
Rough and high gloss finishes must be avoided.


Reflectance - Paper with less than 70% reflectance in the
spectrum of interest should be avoided. Reflectance in the
80% to 90% region is recommended.


Opacity - This should be considered whenever a light weight 
paper is to be used. In general, if printing on the back can
be seen by the eye through the paper, paper opacity is
inadequate and poor reading will result.


Ash - Low cost papers have impurities which may affect 
reading. Ash should have a contrast rating no higher than
blind ink. 


Caliper - Whenever using paper that is less than 0.08 mm 
thick the designer should consider the user environment, the 
opacity and the ability of the scanner to feed and stack the
documents. 


Printing Qualities - The relationship of the paper to the 
print and the ink must be considered. The printer 
manufacturer is generally familiar with papers that can be 
used for OCR quality printing. 

Coating - Coating of paper does not normally affect OCR
reading adversely and, in fact, may enhance overall
acceptability. The major problem with coatings is that
friction can be increased to such a level that the paper
cannot be used in a document or page reader.


2.10. Print Contrast Signal (PCS)


The scanner distinguishes the dark character data from the
background by the variations in inks and papers resulting in
variations in optical signal strength. A common term in both
OCR and Bar-code that relates to signal strength is Print
Contrast Signal (PCS). PCS compares the reflectance of an
ink in relation to the reflectance of the paper. PCS is
defined as:


         RB - RO
   PCS = ------- x 100
           RB

Where RB = Reflectance of background
      RO = Reflectance of printing or blind ink


The reflectance is measured over the spectral region of
interest.


Good read inks have reflectance values less than 18% and
good paper has a value greater than 85%.
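
As a worked illustration of the PCS definition, the short sketch below evaluates it for the limiting figures quoted for a good read ink and good paper. The function name is hypothetical and the 80% figure for a blind ink is an assumed value, chosen only to show the contrast with a read ink.

```python
def pcs(background: float, ink: float) -> float:
    """Print Contrast Signal in percent, from reflectances RB and RO."""
    if background <= 0:
        raise ValueError("background reflectance must be positive")
    return (background - ink) / background * 100


read_ink = pcs(background=85, ink=18)    # figures quoted in the text
blind_ink = pcs(background=85, ink=80)   # 80% is an assumed value
print(f"read ink PCS  = {read_ink:.1f}%")
print(f"blind ink PCS = {blind_ink:.1f}%")
```

A read ink on good paper gives a PCS close to 79%, whereas an ink that reflects nearly as much as the paper gives a PCS of only a few percent and is effectively invisible to the reader.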


APPENDIX A 
APN (Australian Product Number)


The APN code has been established within the framework of
the EAN (European Article Numbering) system which comprises
member countries in Western Europe, Japan and Australasia.
The EAN code was itself developed from UPC (Universal
Product Code) with which it has a one way compatibility at
the present time. This means that articles bar-coded under
the UPC system can be scanned without ambiguity on equipment
designed for the APN system. However, articles bar-coded
under the APN/EAN system cannot be scanned on equipment
currently installed in U.S. stores. Products destined for
the U.S. market-place must, therefore, be coded with UPC.


It is expected that this situation will gradually change
with upgrades to U.S. scanning equipment to allow scanning
of APN/EAN codes.


There are numerous numbering systems used in scanning and 
these are summarised below: 
BAR-CODE TYPE           ------------------- DIGIT NO. -------------------
                        13  12  11  10   9   8   7   6   5   4   3   2   1
Source Numbering
APN Standard Number      9   3   M   M   M   M   M   I   I   I   I   I   C
EAN-13 (General Form)    P1  P2  X   X   X   X   X   X   X   X   X   X   C
(See Fig. 5)
UPC General Form        (0)  P   X   X   X   X   X   X   X   X   X   X   C
UPC National Drug       (0)  3   X   X   X   X   X   X   X   X   X   X   C
UPC Grocery Number      (0)  0   M   M   M   M   M   I   I   I   I   I   C
EAN-8 (General Form)    (0   0   0   0   0)  P1  P2  X   X   X   X   X   C
APN-8 Standard Form     (0   0   0   0   0)  9   3   X   X   X   X   X   C
Coupon Numbering
UPC Coupon Number
(Supplier/Group/Value)  (0)  5   S   S   S   S   S   G   G   G   V   V   C
EAN Coupon Number        9   9   X   X   X   X   X   X   X   X   X   X   C
EAN Coupon Number with
Specified Territory      9   8   T   X   X   X   X   X   X   X   X   X   C
In-Store Numbering
UPC Code + Price        (0)  2   A   A   A   A   A   D   P   P   P   P   C
EAN-13 In-Store Number   2   X   X   X   X   X   X   X   X   X   X   X   C
EAN-8 In-Store Number   (0   0   0   0   0)  2   X   X   X   X   X   X   C
EAN-8 Velocity Codes*   (0   0   0   0   0)  0   X   X   X   X   X   X   C
References:
M Manufacturers Number
I Item Number
X Free Form Numerics
C Check Character
S Supplier Number
G Group Number
V Value
T Territory
A Article Number
D Price Digit
P Price


* Velocity Code refers to the practice of numbering items
in the order of turn over so that the fastest movers have
the least digits to be entered.

Numbers that are in () are implied as the data is right
justified.





Fig. 5. ASSIGNMENT OF PREFIX DIGITS BY EAN 


00 - 09   (Reserved for UPC)
20 - 29   In-Store Numbers
30 - 37   Gencod (France)
40 - 43   CCG (Germany)
49        DCC (Japan)
50        ANA (United Kingdom)
54        ICODIF (Belgium)
57        DVA (Denmark)
61 - 62   (Reserved)
64        CCC (Finland)
65 - 69   (Reserved)
70        Norway
73        Sweden
76        SAV (Switzerland)
80 - 83   Italy
84        AECOC (Spain)
87        UAC (Netherlands)
90 - 91   EAN (Austria)
93        APNA (Australia)
98 - 99   Coupon Numbers


The coding of the digits within the APN symbol is quite
complex and beyond the scope of this booklet to fully
define. The following briefly describes the make up of the
code and the basic structures used in APN. All versions of
the bar-code have the following characteristics in common:


Characters in the symbol representing numerical digits are
made up of 7 light or dark modules.


In these characters the modules are grouped into bars, with
each digit represented by 2 bars and 2 spaces.


A bar or space may comprise from 1 to 4 modules.


In addition to the digit characters, there are auxiliary
characters, comprising fewer modules, used as guard bars or
centre bars for the start, end and separation.


The symbol is designed to be read bi-directionally by a
fixed position scanner. It can also be read uni-directionally
by a hand held scanner or light pen.


The symbol size is variable between limits in magnification,
to accommodate the ranges in quality achievable by the
various printing processes.


Digital values are represented in the bar-code symbols by
7-module characters arranged in different number sets known
as A, B, C, as shown on the next page:


—- APPENDIX 3 - 
CODING OF NUMBER CHARACTERS 
Characters in Number Set A comprise an odd number of dark
modules which are known as characters with odd parity.


Characters in Number Sets B and C comprise an even number of
dark modules which are known as characters with even parity.


Characters in Number Sets A and B always begin on the left
with a light module and end on the right with a dark module.
Characters in Number Set C begin on the left with a dark
module and end on the right with a light module. Taken in
conjunction with the modules for guard pattern and centre
pattern, it can be seen that every character in a symbol
begins and ends with a different module, light or dark, from
its neighbour to left or right. This means that the boundary
between 2 characters can always be visually distinguished,
which is essential for unambiguous decoding.


FORMAT OF 12 CHARACTER BAR-CODES 


APN standard 13-digit numbers, other numbers in the EAN-13
series and UPC-A series numbers are all represented by a 12
character bar-code. This bar-code is made up as follows,
reading from left to right:


(1) A normal guard pattern 

(2) 6 digit characters, comprising the left half of the 
symbol, from Number Sets A or B.

(3) A centre pattern 

(4) 6 digit characters, comprising the right half of the 
symbol, from Number Set C. 

(5) A normal guard pattern 


[Figure: layout of the 12 character bar-code. From left to
right: normal guard pattern; 6 left hand number characters
with variable parity; centre guard pattern; 6 right hand
number characters with fixed parity; normal guard pattern.
The 13th digit is encoded by the variable parity. The
human-readable characters are printed in OCR-B.]

The bar-code comprises only 12 digit characters and
therefore the 13th digit (leftmost) in the above number
series is not directly represented but is encoded by
permutation in the mix of Number Sets A and B in the left
half of the symbol. As all numbers in the APN series
commence with a prefix digit 9, the mix of Number Sets A and
B in the left half will be: ABBABA
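
The structure described above can be sketched in a few lines. The module patterns and the parity permutations below are the standard EAN-13 values as best recalled, not figures taken from this booklet, so they should be checked against the official specification before use. The function name and the sample number are hypothetical. "1" represents a dark module and "0" a light module.

```python
# Number Set A, digits 0 to 9: 7 modules each, odd number of dark modules.
SET_A = ["0001101", "0011001", "0010011", "0111101", "0100011",
         "0110001", "0101111", "0111011", "0110111", "0001011"]
# Number Set C is Set A with light and dark exchanged.
SET_C = ["".join("1" if m == "0" else "0" for m in c) for c in SET_A]
# Number Set B is Set C read from right to left.
SET_B = [c[::-1] for c in SET_C]

# Mix of Number Sets in the left half, indexed by the 13th (leftmost) digit.
PARITY = ["AAAAAA", "AABABB", "AABBAB", "AABBBA", "ABAABB",
          "ABBAAB", "ABBBAA", "ABABAB", "ABABBA", "ABBABA"]


def encode(number: str) -> str:
    """Return the module string for a 13 digit number (check digit included)."""
    if len(number) != 13 or not number.isdigit():
        raise ValueError("expected 13 digits")
    mix = PARITY[int(number[0])]
    left = "".join((SET_A if s == "A" else SET_B)[int(d)]
                   for s, d in zip(mix, number[1:7]))
    right = "".join(SET_C[int(d)] for d in number[7:])
    # guard pattern + left half + centre pattern + right half + guard pattern
    return "101" + left + "01010" + right + "101"


modules = encode("9312345678907")  # made-up number with prefix 93
```

The leading 9 never appears as a character of its own; it only selects the ABBABA mix used for the six left hand characters.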


UPC BAR-CODE DIFFERENCES 


The 12 character bar-code used for the source marking of UPC
numbers is the same in all essentials to that described
above with only the following minor differences:


The 13th digit derived from the left hand parity (value 0 in
this case) is not shown in human readable characters.


The OCR-B digit corresponding to the first bar-code number
character is usually printed to the left of the bar-code.


The last bar-code character (the check character) is often
not printed at all in human readable characters.


The first and last bar-code number characters are extended
in depth.


FORMAT OF 8 CHARACTER BAR-CODE 


This shorter version bar-code is made up as follows, reading 
left to right: 


(1) A normal guard band pattern 

(2) 4 digit characters from Number Set A comprising the left 
half of the symbol 

(3) A centre pattern 

(4) 4 digit characters from Number Set C comprising the 
right half of the symbol 

(5) A normal guard pattern


[Figure: layout of the 8 character bar-code. From left to
right: normal guard pattern; 4 left hand number characters
with fixed parity; centre guard pattern; 4 right hand number
characters with fixed parity; normal guard pattern. The
human-readable characters are printed in OCR-B.]


There is no parity variation from the fixed pattern in the 
left and right halves of the symbol. 


UPC SUPPLEMENTS 


Many books are printed with a UPC symbol on their back
cover. This is often in two parts, the left hand part being
a standard UPC-A symbol and the right hand part being a 2 or
5 digit symbol. This supplement is used, amongst other
things, to indicate price. With some scanners the total
symbol can only be scanned from left to right and there is
always the possibility that the supplement will not be
recognised at all.
APPENDIX B
MARK SHEET READERS 


Mark reading evolved directly from the eighty column punched
card where the punched holes were replaced with marks made
with a high carbon content pencil. When the card was fed
through a reader small electrical contacts would sense the
passing of a mark under them and thereby read the data from
the card. This method worked well until dirt and carbon
built up on the contacts rendering them inoperable.

The usual method of using the card was to pre-punch computer
generated data and read the punched holes and the user made
marks back into the system; the first turn around document.
The modern Optical Mark Reader is a far cry from the old
Mark Sense Machines; however, the smaller units that are
manufactured still read a document that has the same
dimensions as a punched card. They also, in fact, tend to be
12 rows high but can read far in excess of 80 columns.
Larger readers will read pages up to 14 inches in length
with up to 30 marks being read on each row.
The most common application areas for mark reading are:


Examination multiple choice answer sheets


Logging laboratory test results


Meter reading


Market surveys


The ideal use is where a number of pre-defined options are
to be selected by a mark in the appropriate box and there is
only a small amount of numeric data to be filled in by the
user. A simple example is shown on the following page. This
is an examination card where the rows to be scanned are
indicated by a black timing mark on the left of the card.
APPENDIX C
MAGNETIC STRIPE READING


The ubiquitous credit card is well known to us all. In
recent years, however, a new magnetic stripe has appeared on
the back of the card. This stripe is accepted for the access
it gives us to funds and credit but little is understood as
to the format and use of the data therein encoded. The
following brief description should help in understanding how
Magnetic Cards can be used in a number of applications.


C.1. Magnetic Stripe Data Formats 


There are three tracks which can contain data on the card.
Each of these tracks has a different purpose and format.


C.1.1. Track 1. This is known as the IATA track and is the
upper track in the stripe. This contains alpha-numeric
information encoded in a seven bit binary structure
(including an odd parity bit) with a maximum block length of
79 characters and the following format.


Start sentinel = "%"             1 character
Format code = "A"                1 character
Account number                   up to 19 characters
Separator = "^"                  1 character
Surname
Surname separator = "/"
Initials, or first name
Separator (when required)
  = "space"                      2 to 26 characters
Title (when used)
Separator (when required)
  = "space"
Separator = "^"                  1 character
Discretionary data               the balance up to the
                                 maximum record length
                                 (79 characters)
Stop sentinel = "?"              1 character
Longitudinal redundancy
  check                          1 character


Where the format code = "A" the Account number is not
present and, therefore, the discretionary data can be
increased by 19 characters.


The encoding density of track 1 is 8.3 bits per mm.


C.1.2. Track 2. This is known as the ABA track and is the
middle track in the stripe. It is this track that is used by
banks and oil companies through Automatic Teller Machines
(ATMs) and EFTpos terminals. The track contains numeric
information encoded in a five bit BCD structure (including
an odd parity bit) with a maximum block length of 40
characters and the following format.


Start sentinel                   1 character
Account number                   up to 19 characters
Separator                        1 character
Discretionary data               the balance of the
                                 maximum record length
                                 (40 characters)
Stop sentinel                    1 character
Longitudinal redundancy
  check                          1 character


The encoding density for track 2 is 3 bits per mm. 


C.1.3. Track 3. This is known as the THRIFT track and is
the lowest track in the stripe. The data on track 3 can be
used in conjunction with that on track 2. This mode of
operation requires that the original encoded data on track 2
be read; the data on track 3 be read; and, if update is
required, all the data on track 3 be re-written. The
alpha-numeric information on this track is contained
in 107 characters including 29 fields. The details of the
format are quite complex and beyond the scope of this manual.


The encoding density of track 3 is 8.3 bits per mm.


Longitudinal Redundancy Check This LRC appears at the end
of every record in all tracks. This is generated with
horizontal parity for the data and has its own even parity
bit.
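
A minimal sketch of the LRC calculation for a track 2 record follows. The four bit values given to the start sentinel, separator and stop sentinel (";", "=" and "?") are the customary ABA assignments and are an assumption here, since the booklet does not list them. The parity bit of the LRC character itself is deliberately left out, and the sample record is made up.

```python
# Four bit BCD values for track 2 characters. The sentinel and separator
# values are assumed (customary ABA coding), not taken from the booklet.
TRACK2 = {**{str(d): d for d in range(10)}, ";": 0xB, "=": 0xD, "?": 0xF}


def with_odd_parity(value: int) -> int:
    """Add the fifth (parity) bit so each character has an odd number of 1s."""
    ones = bin(value).count("1")
    return value | (((ones + 1) % 2) << 4)


def lrc(record: str) -> int:
    """Horizontal parity: exclusive-OR of the data bits of every character,
    so that each bit column, LRC included, holds an even number of 1s."""
    value = 0
    for ch in record:
        value ^= TRACK2[ch]
    return value


record = ";12=34?"          # made-up record: sentinels, separator, digits
check = lrc(record)
```

A reader repeats the calculation over the record as received; if the exclusive-OR of every character including the LRC is not zero, the swipe is rejected.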


APPENDIX D


Serial, Asynchronous Data Transmission

D.1. Basic Characteristics Serial data transmission is
characterised by transmitting one data bit at a time between 
two computers. This data flow can follow one of three 
transmission modes: 


Simplex, which only allows data transmission in one 
direction. 


[Diagram: simplex configuration. Data flows from the
transmitter to the receiver only.]


Half-duplex, which allows non-simultaneous data transmission
in both directions.


[Diagram: half-duplex configuration. Each end has a
transmitter and a receiver; data flows in one direction at a
time.]


Full-duplex, which allows simultaneous two-way transmissions.


[Diagram: full-duplex configuration. Each end has a
transmitter and a receiver; data flows in both directions at
once.]


Within the asynchronous data stream, each character of data 
consists of up to 11 binary bits which have the following 
attributes. 


1 - one start bit which tells the receiver to expect
following data


2 - between five and eight data bits which have a
unique value and are represented in a table as an ASCII,
EBCDIC (7 or 8 bits), Transcode (6 bits) or Baudot (5
bits) code.


3 - one parity bit which can be Odd, Even, set to "1", set
to "0" or ignored.


4 - between one and two stop bits. The slower the
transmission speed, the greater the number of stop bits
required.
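
The make up of one asynchronous character can be sketched as below. The start bit being a "0", the stop bits being "1" and the data being sent least significant bit first are the usual conventions and are assumptions here, as the text does not state them. The function name and parameter names are hypothetical.

```python
def frame(value, data_bits=8, parity="even", stop_bits=1):
    """Return the list of bits sent on the line for one character."""
    bits = [0]                                            # 1 - start bit
    data = [(value >> i) & 1 for i in range(data_bits)]   # 2 - data, LSB first
    bits += data
    if parity == "even":                                  # 3 - parity bit
        bits.append(sum(data) % 2)
    elif parity == "odd":
        bits.append((sum(data) + 1) % 2)
    elif parity == "mark":                                # set to "1"
        bits.append(1)
    elif parity == "space":                               # set to "0"
        bits.append(0)
    elif parity != "none":                                # ignored: no bit sent
        raise ValueError("unknown parity setting")
    bits += [1] * stop_bits                               # 4 - stop bit(s)
    return bits


sent = frame(ord("A"), data_bits=7, parity="even", stop_bits=1)
```

With 7 data bits, even parity and 1 stop bit each character occupies 10 bits on the line, so a 9600 bps link carries at most 960 characters per second.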


The transmitting and receiving ends of the serial link must
both be set to the same rate of transmission. This rate is
measured in bits per second (bps) or baud rate (where one
baud indicates one bit of information). Typical asynchronous
baud rates are 110, 300, 600, 1200, 2400, 4800, 9600 and
19,200 bps. Higher baud rates can be used but this requires
an environment with low electrical noise and the use of high
quality cables.


D.2. Controlling the transmission.


When a receiving device is unable to accept any more 
characters there must be a means of telling the sender to 
stop; this is done by the use of a serial flow control. 
Three common serial flow control methods are: 


D.2.1. X-ON/X-OFF protocol When a receiving device nears
its memory capacity it sends a code (ASCII DC3 - X-OFF) to
the transmitting device telling it to stop sending data.
When the receiving device is again able to accept data it
then returns an ASCII DC1 (X-ON) to the transmitting device,
telling it to resume transmission.
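
The exchange can be simulated with a small sketch. The Receiver class, its high and low water marks and the send function are all hypothetical, and a real link would run the two ends concurrently rather than in one loop.

```python
from collections import deque

XON, XOFF = 0x11, 0x13   # ASCII DC1 and DC3


class Receiver:
    """Buffer that asks the sender to pause when it is nearly full."""

    def __init__(self, high_water=6, low_water=2):
        self.buffer = deque()
        self.high_water = high_water
        self.low_water = low_water
        self.paused = False

    def accept(self, byte):
        """Store one byte; return XOFF once the high water mark is reached."""
        self.buffer.append(byte)
        if not self.paused and len(self.buffer) >= self.high_water:
            self.paused = True
            return XOFF
        return None

    def drain(self, count):
        """Process up to count bytes; return XON once there is room again."""
        for _ in range(min(count, len(self.buffer))):
            self.buffer.popleft()
        if self.paused and len(self.buffer) <= self.low_water:
            self.paused = False
            return XON
        return None


def send(data, receiver, drain_step=4):
    """Send every byte, stopping on XOFF and resuming on XON.
    Returns the control codes seen, in order."""
    control_log = []
    pending = deque(data)
    sending = True
    while pending:
        if sending:
            if receiver.accept(pending.popleft()) == XOFF:
                control_log.append(XOFF)
                sending = False
        elif receiver.drain(drain_step) == XON:
            control_log.append(XON)
            sending = True
    return control_log
```

Sending nine bytes to a receiver with the default settings produces one X-OFF after the sixth byte and one X-ON once the receiver has worked its buffer down.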


D.2.2. ENQ/ACK protocol This is a block oriented protocol
which sends a fixed block size of characters every time it
transmits. Typically the transmitting device sends an ENQ
character and waits for an ACK character to be returned
before transmitting data. If the receiving device is not
ready it will return a NAK character. When the ACK has been
received the entire block is transmitted and the
transmitting device then resumes polling the line with ENQs.


D.2.3. RTS/CTS or DTR handshaking These mnemonics refer to
physical electrical lines defined in the RS-232C interface.
The operation is very simple in that the transmitting device
will assert the RTS (Request to Send) line and wait for the
receiving device to assert the CTS (Clear to Send) line.

When the receiving device cannot accept more data it drops
the CTS signal until it can again accept data.

The DTR (Data Terminal Ready) line is used in a similar way
to CTS and will mainly be found in serial printers.


D.3. ASCII Table 
(American Standard Code for Information Interchange)


[Table: the ASCII character set, giving each character (CHAR)
with its decimal (DEC) and hexadecimal (HEX) value.]
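
For reference, the CHAR, DEC and HEX columns of the ASCII table can be regenerated with a short sketch. The control character mnemonics are the standard ASCII names; the function names and the print layout are arbitrary choices for the example.

```python
# Standard mnemonics for ASCII codes 0 to 31.
CONTROL_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]


def ascii_name(code):
    """Printable character, or the mnemonic for a non-printing code."""
    if code < 32:
        return CONTROL_NAMES[code]
    if code == 32:
        return "SP"
    if code == 127:
        return "DEL"
    return chr(code)


def ascii_table():
    """List of (CHAR, DEC, HEX) rows for all 128 ASCII codes."""
    return [(ascii_name(c), c, format(c, "02X")) for c in range(128)]


print("CHAR  DEC  HEX")
for name, dec, hexa in ascii_table():
    print(f"{name:<5}{dec:>4}  {hexa}")
```

The rows for DC1 and DC3 give the X-ON and X-OFF codes, and those for ENQ, ACK and NAK give the codes used by the ENQ/ACK protocol.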





D.4. RS-232 Interface. 


This is an EIA standard, applicable to the 25 pin ("D" 
connector) interconnection of Data Terminal Equipment (DTE) 
and Data Communications Equipment (DCE) employing serial
binary data communications. This is the most common form of 
connection to most computer systems. The following diagrams 
define the connections for RS-232. 


RS-232 Interface


[Diagram: pin assignments of the 25 pin "D" connector. Labels
recoverable from the diagram: DCE Transmitter Signal Element
Timing, Secondary Received Data, Receiver Signal Element
Timing, Secondary Request to Send, Data Terminal Ready, Data
Signal Rate Selector, DTE Transmitter Signal Element Timing.]


[Table: RS-232 signals. Entries recoverable from the table:
Data Signal Rate Selector (DTE), Data Signal Rate Selector
(DCE), Transmitter Signal Element Timing (DTE), Transmitter
Signal Element Timing (DCE), Receiver Signal Element Timing
(DCE), 116 Select Standby, 117 Standby Indicator, 126 Select
Transmit Frequency.]


MAY HI-TRON BE OF FURTHER ASSISTANCE 


[ ] Please evaluate the attached forms for use with:
    [ ] OCR
    [ ] Bar-code


[ ] Please send me further information on:
    [ ] Hand held OCR
    [ ] MICR scanning
    [ ] Hand Print scanning
    [ ] Remittance Processing
    [ ] Multi-media systems
    [ ] Bar-code scanning
    [ ] Magnetic Stripe Reading
    [ ] Mark Readers
    [ ] Portable Data Entry


My area of application is:
[ ] Retail  [ ] Financial  [ ] Manufacturing  [ ] Cash Receipting
[ ] Library


I have need of this:
[ ] Immediately  [ ] 30-60 days  [ ] Within 6 months


Please have someone contact me at ( ) 
to arrange an appointment 


Comments: 


Name: 
Title: 


Company: 


Address:


Postcode: 